Efficient algorithms for decision tree cross-validation
Abstract
Cross-validation is a useful and generally applicable technique often employed in machine learning, including decision tree induction. An important disadvantage of a straightforward implementation of the technique is its computational overhead. In this paper we show that, for decision trees, the computational overhead of cross-validation can be reduced significantly by integrating the cross-validation with the normal decision tree induction process. We discuss how existing decision tree algorithms can be adapted to this aim, and provide an analysis of the speedups these adaptations may yield. The analysis is supported by experimental results.
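The abstract does not spell out how the integration works, so the following is only a minimal sketch of one way the statistics needed for split evaluation could be shared across folds. It assumes that a candidate split is scored from (test outcome, class) counts, and that the counts for each fold's training set can be obtained by subtracting that fold's counts from the overall counts instead of rescanning the data once per fold. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter

def fold_counts(rows, fold_of, k):
    """One pass over the data: count (test outcome, class) pairs per fold."""
    per_fold = [Counter() for _ in range(k)]
    for row, f in zip(rows, fold_of):
        per_fold[f][row] += 1
    return per_fold

def training_counts(per_fold):
    """Training-set counts for fold i = overall counts minus fold i's counts."""
    total = Counter()
    for c in per_fold:
        total.update(c)
    shared = []
    for c in per_fold:
        train = Counter(total)
        train.subtract(c)
        shared.append(+train)  # unary plus drops entries whose count is zero
    return total, shared

def naive_training_counts(rows, fold_of, k):
    """Baseline for comparison: rescan the whole data set once per fold."""
    return [Counter(r for r, f in zip(rows, fold_of) if f != i) for i in range(k)]

# Toy data (hypothetical): each row is (outcome of a candidate test, class label).
rows = [(True, "a"), (True, "a"), (False, "b"),
        (True, "b"), (False, "b"), (False, "a")]
fold_of = [0, 1, 2, 0, 1, 2]  # fold assignment of each row
k = 3

per_fold = fold_counts(rows, fold_of, k)
total, shared = training_counts(per_fold)
naive = naive_training_counts(rows, fold_of, k)
```

Under these assumptions the shared computation reads the data once for a candidate test and then needs only k small subtractions, whereas the baseline rereads the data k times. Both produce the same counts, so any split heuristic computed from them would give identical results.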
Similar resources
Efficient Algorithms for Decision Tree Cross-validation
Cross-validation is a useful and generally applicable technique often employed in machine learning, including decision tree induction. An important disadvantage of straightforward implementation of the technique is its computational overhead. In this paper we show that, for decision trees, the computational overhead of cross-validation can be reduced significantly by integrating the cross-valida...
Presenting a Model for Predicting Hemodialysis Filter Type Using Data Mining Techniques
Introduction: Inadequate dialysis is a mortality risk for kidney patients, which calls for a model that helps dialysis staff provide appropriate services to dialysis patients and manage their treatment properly. Since the buffer type plays a determining role in the adequacy of dialysis, the present study aims to determine the hemodialysis buffer type. ...
Cross-Validated C4.5: Using Error Estimation for Automatic Parameter Selection
Machine learning algorithms for supervised learning are in wide use. An important issue in the use of these algorithms is how to set the parameters of the algorithm. While the default parameter values may be appropriate for a wide variety of tasks, they are not necessarily optimal for a given task. In this paper, we investigate the use of cross-validation to select parameters for the C4.5 decis...
Evaluation of Best First Decision Tree on Categorical Soil Survey Data for Land Capability Classification
Land capability classification (LCC) of a soil map unit is sought for sustainable use, management and conservation practices. The high speed, high precision and simple rule generation of machine learning algorithms can be used to construct pre-defined rules for the LCC of soil map units when developing decision support systems for land use planning of an area. Decision tree (DT) is one of the mos...
Comparison of Machine Learning Algorithms for Broad Leaf Species Classification Using UAV-RGB Images
Abstract: Knowing the tree species composition of forests provides valuable information for studying the forest's economic value, fire risk assessment, biodiversity monitoring, and wildlife habitat improvement. Fieldwork is often time-consuming and labor-intensive, free satellite data are available only in coarse resolution, and the use of manned aircraft is relatively costly. Recently, unmanned aeria...